{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "sparsity_and_l1_regularization.ipynb",
      "version": "0.3.2",
      "views": {},
      "default_view": {},
      "provenance": [],
      "collapsed_sections": [
        "yjUCX5LAkxAX",
        "copyright-notice"
      ]
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "metadata": {
        "id": "copyright-notice",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        "#### Copyright 2017 Google LLC."
      ]
    },
    {
      "metadata": {
        "colab": {
          "autoexec": {
            "wait_interval": 0,
            "startup": false
          }
        },
        "id": "copyright-notice2",
        "colab_type": "code",
        "cellView": "both"
      },
      "cell_type": "code",
      "outputs": [],
      "execution_count": 0,
      "source": [
        "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "metadata": {
        "id": "g4T-_IsVbweU",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " # Parcimonie et r\u00e9gularisation\u00a0L1"
      ]
    },
    {
      "metadata": {
        "id": "g8ue2FyFIjnQ",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " **Objectifs d'apprentissage\u00a0:**\n",
        "  * Calculer la taille d'un mod\u00e8le\n",
        "  * Appliquer une r\u00e9gularisation\u00a0L1 afin de r\u00e9duire la taille d'un mod\u00e8le en augmentant la parcimonie"
      ]
    },
    {
      "metadata": {
        "id": "ME_WXE7cIjnS",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " Certaines fonctions de r\u00e9gularisation peuvent faire en sorte que des pond\u00e9rations soient exactement \u00e9gales \u00e0 z\u00e9ro. Il est donc judicieux d'utiliser une telle r\u00e9gularisation pour rendre les op\u00e9rations moins complexes. Dans le cas des mod\u00e8les lin\u00e9aires, comme la r\u00e9gression, une pond\u00e9ration nulle \u00e9quivaut \u00e0 ne pas utiliser la caract\u00e9ristique correspondante. Outre le fait que cela permet d'\u00e9viter le surapprentissage, le mod\u00e8le obtenu s'av\u00e8re plus efficace.\n",
        "\n",
        "La r\u00e9gularisation\u00a0L1 est un bon moyen d'augmenter la parcimonie.\n"
      ]
    },
    {
      "metadata": {
        "id": "fHRzeWkRLrHF",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## Configuration\n",
        "\n",
        "Ex\u00e9cutez les cellules ci-dessous pour charger les donn\u00e9es et cr\u00e9er des d\u00e9finitions de caract\u00e9ristique."
      ]
    },
    {
      "metadata": {
        "id": "pb7rSrLKIjnS",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "from __future__ import print_function\n",
        "\n",
        "import math\n",
        "\n",
        "from IPython import display\n",
        "from matplotlib import cm\n",
        "from matplotlib import gridspec\n",
        "from matplotlib import pyplot as plt\n",
        "import numpy as np\n",
        "import pandas as pd\n",
        "from sklearn import metrics\n",
        "import tensorflow as tf\n",
        "from tensorflow.python.data import Dataset\n",
        "\n",
        "tf.logging.set_verbosity(tf.logging.ERROR)\n",
        "pd.options.display.max_rows = 10\n",
        "pd.options.display.float_format = '{:.1f}'.format\n",
        "\n",
        "california_housing_dataframe = pd.read_csv(\"https://download.mlcc.google.com/mledu-datasets/california_housing_train.csv\", sep=\",\")\n",
        "\n",
        "california_housing_dataframe = california_housing_dataframe.reindex(\n",
        "    np.random.permutation(california_housing_dataframe.index))"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "3V7q8jk0IjnW",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def preprocess_features(california_housing_dataframe):\n",
        "  \"\"\"Prepares input features from California housing data set.\n",
        "\n",
        "  Args:\n",
        "    california_housing_dataframe: A Pandas DataFrame expected to contain data\n",
        "      from the California housing data set.\n",
        "  Returns:\n",
        "    A DataFrame that contains the features to be used for the model, including\n",
        "    synthetic features.\n",
        "  \"\"\"\n",
        "  selected_features = california_housing_dataframe[\n",
        "    [\"latitude\",\n",
        "     \"longitude\",\n",
        "     \"housing_median_age\",\n",
        "     \"total_rooms\",\n",
        "     \"total_bedrooms\",\n",
        "     \"population\",\n",
        "     \"households\",\n",
        "     \"median_income\"]]\n",
        "  processed_features = selected_features.copy()\n",
        "  # Create a synthetic feature.\n",
        "  processed_features[\"rooms_per_person\"] = (\n",
        "    california_housing_dataframe[\"total_rooms\"] /\n",
        "    california_housing_dataframe[\"population\"])\n",
        "  return processed_features\n",
        "\n",
        "def preprocess_targets(california_housing_dataframe):\n",
        "  \"\"\"Prepares target features (i.e., labels) from California housing data set.\n",
        "\n",
        "  Args:\n",
        "    california_housing_dataframe: A Pandas DataFrame expected to contain data\n",
        "      from the California housing data set.\n",
        "  Returns:\n",
        "    A DataFrame that contains the target feature.\n",
        "  \"\"\"\n",
        "  output_targets = pd.DataFrame()\n",
        "  # Create a boolean categorical feature representing whether the\n",
        "  # median_house_value is above a set threshold.\n",
        "  output_targets[\"median_house_value_is_high\"] = (\n",
        "    california_housing_dataframe[\"median_house_value\"] > 265000).astype(float)\n",
        "  return output_targets"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "pAG3tmgwIjnY",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "# Choose the first 12000 (out of 17000) examples for training.\n",
        "training_examples = preprocess_features(california_housing_dataframe.head(12000))\n",
        "training_targets = preprocess_targets(california_housing_dataframe.head(12000))\n",
        "\n",
        "# Choose the last 5000 (out of 17000) examples for validation.\n",
        "validation_examples = preprocess_features(california_housing_dataframe.tail(5000))\n",
        "validation_targets = preprocess_targets(california_housing_dataframe.tail(5000))\n",
        "\n",
        "# Double-check that we've done the right thing.\n",
        "print(\"Training examples summary:\")\n",
        "display.display(training_examples.describe())\n",
        "print(\"Validation examples summary:\")\n",
        "display.display(validation_examples.describe())\n",
        "\n",
        "print(\"Training targets summary:\")\n",
        "display.display(training_targets.describe())\n",
        "print(\"Validation targets summary:\")\n",
        "display.display(validation_targets.describe())"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "gHkniRI1Ijna",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def my_input_fn(features, targets, batch_size=1, shuffle=True, num_epochs=None):\n",
        "    \"\"\"Trains a linear regression model.\n",
        "  \n",
        "    Args:\n",
        "      features: pandas DataFrame of features\n",
        "      targets: pandas DataFrame of targets\n",
        "      batch_size: Size of batches to be passed to the model\n",
        "      shuffle: True or False. Whether to shuffle the data.\n",
        "      num_epochs: Number of epochs for which data should be repeated. None = repeat indefinitely\n",
        "    Returns:\n",
        "      Tuple of (features, labels) for next data batch\n",
        "    \"\"\"\n",
        "  \n",
        "    # Convert pandas data into a dict of np arrays.\n",
        "    features = {key:np.array(value) for key,value in dict(features).items()}                                            \n",
        " \n",
        "    # Construct a dataset, and configure batching/repeating.\n",
        "    ds = Dataset.from_tensor_slices((features,targets)) # warning: 2GB limit\n",
        "    ds = ds.batch(batch_size).repeat(num_epochs)\n",
        "    \n",
        "    # Shuffle the data, if specified.\n",
        "    if shuffle:\n",
        "      ds = ds.shuffle(10000)\n",
        "    \n",
        "    # Return the next batch of data.\n",
        "    features, labels = ds.make_one_shot_iterator().get_next()\n",
        "    return features, labels"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "bLzK72jkNJPf",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def get_quantile_based_buckets(feature_values, num_buckets):\n",
        "  quantiles = feature_values.quantile(\n",
        "    [(i+1.)/(num_buckets + 1.) for i in range(num_buckets)])\n",
        "  return [quantiles[q] for q in quantiles.keys()]"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "al2YQpKyIjnd",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def construct_feature_columns():\n",
        "  \"\"\"Construct the TensorFlow Feature Columns.\n",
        "\n",
        "  Returns:\n",
        "    A set of feature columns\n",
        "  \"\"\"\n",
        "\n",
        "  bucketized_households = tf.feature_column.bucketized_column(\n",
        "    tf.feature_column.numeric_column(\"households\"),\n",
        "    boundaries=get_quantile_based_buckets(training_examples[\"households\"], 10))\n",
        "  bucketized_longitude = tf.feature_column.bucketized_column(\n",
        "    tf.feature_column.numeric_column(\"longitude\"),\n",
        "    boundaries=get_quantile_based_buckets(training_examples[\"longitude\"], 50))\n",
        "  bucketized_latitude = tf.feature_column.bucketized_column(\n",
        "    tf.feature_column.numeric_column(\"latitude\"),\n",
        "    boundaries=get_quantile_based_buckets(training_examples[\"latitude\"], 50))\n",
        "  bucketized_housing_median_age = tf.feature_column.bucketized_column(\n",
        "    tf.feature_column.numeric_column(\"housing_median_age\"),\n",
        "    boundaries=get_quantile_based_buckets(\n",
        "      training_examples[\"housing_median_age\"], 10))\n",
        "  bucketized_total_rooms = tf.feature_column.bucketized_column(\n",
        "    tf.feature_column.numeric_column(\"total_rooms\"),\n",
        "    boundaries=get_quantile_based_buckets(training_examples[\"total_rooms\"], 10))\n",
        "  bucketized_total_bedrooms = tf.feature_column.bucketized_column(\n",
        "    tf.feature_column.numeric_column(\"total_bedrooms\"),\n",
        "    boundaries=get_quantile_based_buckets(training_examples[\"total_bedrooms\"], 10))\n",
        "  bucketized_population = tf.feature_column.bucketized_column(\n",
        "    tf.feature_column.numeric_column(\"population\"),\n",
        "    boundaries=get_quantile_based_buckets(training_examples[\"population\"], 10))\n",
        "  bucketized_median_income = tf.feature_column.bucketized_column(\n",
        "    tf.feature_column.numeric_column(\"median_income\"),\n",
        "    boundaries=get_quantile_based_buckets(training_examples[\"median_income\"], 10))\n",
        "  bucketized_rooms_per_person = tf.feature_column.bucketized_column(\n",
        "    tf.feature_column.numeric_column(\"rooms_per_person\"),\n",
        "    boundaries=get_quantile_based_buckets(\n",
        "      training_examples[\"rooms_per_person\"], 10))\n",
        "\n",
        "  long_x_lat = tf.feature_column.crossed_column(\n",
        "    set([bucketized_longitude, bucketized_latitude]), hash_bucket_size=1000)\n",
        "\n",
        "  feature_columns = set([\n",
        "    long_x_lat,\n",
        "    bucketized_longitude,\n",
        "    bucketized_latitude,\n",
        "    bucketized_housing_median_age,\n",
        "    bucketized_total_rooms,\n",
        "    bucketized_total_bedrooms,\n",
        "    bucketized_population,\n",
        "    bucketized_households,\n",
        "    bucketized_median_income,\n",
        "    bucketized_rooms_per_person])\n",
        "  \n",
        "  return feature_columns"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "hSBwMrsrE21n",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## Calculer la taille du mod\u00e8le\n",
        "\n",
        "Pour calculer la taille du mod\u00e8le, vous allez simplement compter le nombre de param\u00e8tres non\u00a0nuls. Une fonction est mise \u00e0 votre disposition pour vous faciliter la t\u00e2che. Cette fonction met en pratique une connaissance approfondie de l'API\u00a0Estimators. N'essayez pas de comprendre comment elle fonctionne\u00a0; cela n'est pas n\u00e9cessaire."
      ]
    },
    {
      "metadata": {
        "id": "e6GfTI0CFhB8",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def model_size(estimator):\n",
        "  variables = estimator.get_variable_names()\n",
        "  size = 0\n",
        "  for variable in variables:\n",
        "    if not any(x in variable \n",
        "               for x in ['global_step',\n",
        "                         'centered_bias_weight',\n",
        "                         'bias_weight',\n",
        "                         'Ftrl']\n",
        "              ):\n",
        "      size += np.count_nonzero(estimator.get_variable_value(variable))\n",
        "  return size"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "XabdAaj67GfF",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ## R\u00e9duire la taille du mod\u00e8le\n",
        "\n",
        "Votre \u00e9quipe doit construire un mod\u00e8le de r\u00e9gression logistique tr\u00e8s pr\u00e9cis sur le *SmartRing*, un anneau tellement intelligent qu'il est capable de \"d\u00e9tecter\" les donn\u00e9es d\u00e9mographiques d'un \u00eelot urbain ('median_income', 'avg_rooms', 'households', etc.) et de vous dire s'il s'agit d'un quartier cher ou non.\n",
        "\n",
        "Compte tenu de la petite taille du SmartRing, l'\u00e9quipe d'ing\u00e9nieurs a d\u00e9termin\u00e9 qu'il ne pouvait traiter qu'un mod\u00e8le **ne comportant pas plus de 600\u00a0param\u00e8tres**. L'\u00e9quipe en charge de la gestion des produits a, pour sa part, \u00e9tabli que le mod\u00e8le ne pourrait pas \u00eatre lanc\u00e9 tant que la **perte logistique ne serait pas inf\u00e9rieure \u00e0 0,35** sur l'ensemble d'\u00e9valuation mis de c\u00f4t\u00e9.\n",
        "\n",
        "Pouvez-vous utiliser votre arme secr\u00e8te, \u00e0 savoir la r\u00e9gularisation\u00a0L1, pour optimiser le mod\u00e8le afin qu'il r\u00e9ponde \u00e0 la fois aux contraintes de taille et de justesse\u00a0?"
      ]
    },
    {
      "metadata": {
        "id": "G79hGRe7qqej",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ### T\u00e2che\u00a01\u00a0: Trouver un coefficient de r\u00e9gularisation satisfaisant\n",
        "\n",
        "**Trouvez un param\u00e8tre d'intensit\u00e9 de r\u00e9gularisation\u00a0L1 qui r\u00e9ponde aux deux contraintes, \u00e0 savoir une taille de mod\u00e8le inf\u00e9rieure \u00e0 600 et une perte logistique inf\u00e9rieure \u00e0 0,35 sur l'ensemble de validation.**\n",
        "\n",
        "Le code ci-dessous vous aidera \u00e0 bien commencer. Vous pouvez appliquer la r\u00e9gularisation \u00e0 votre mod\u00e8le de diff\u00e9rentes mani\u00e8res. Dans le cas pr\u00e9sent, vous allez utiliser `FtrlOptimizer`, une classe con\u00e7ue pour offrir de meilleurs r\u00e9sultats que la descente de gradient standard avec la r\u00e9gularisation\u00a0L1.\n",
        "\n",
        "Ici encore, l'entra\u00eenement du mod\u00e8le est effectu\u00e9 sur l'int\u00e9gralit\u00e9 de l'ensemble de donn\u00e9es. Attendez-vous donc \u00e0 ce qu'il s'ex\u00e9cute plus lentement que d'habitude."
      ]
    },
    {
      "metadata": {
        "id": "1Fcdm0hpIjnl",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "def train_linear_classifier_model(\n",
        "    learning_rate,\n",
        "    regularization_strength,\n",
        "    steps,\n",
        "    batch_size,\n",
        "    feature_columns,\n",
        "    training_examples,\n",
        "    training_targets,\n",
        "    validation_examples,\n",
        "    validation_targets):\n",
        "  \"\"\"Trains a linear regression model.\n",
        "  \n",
        "  In addition to training, this function also prints training progress information,\n",
        "  as well as a plot of the training and validation loss over time.\n",
        "  \n",
        "  Args:\n",
        "    learning_rate: A `float`, the learning rate.\n",
        "    regularization_strength: A `float` that indicates the strength of the L1\n",
        "       regularization. A value of `0.0` means no regularization.\n",
        "    steps: A non-zero `int`, the total number of training steps. A training step\n",
        "      consists of a forward and backward pass using a single batch.\n",
        "    feature_columns: A `set` specifying the input feature columns to use.\n",
        "    training_examples: A `DataFrame` containing one or more columns from\n",
        "      `california_housing_dataframe` to use as input features for training.\n",
        "    training_targets: A `DataFrame` containing exactly one column from\n",
        "      `california_housing_dataframe` to use as target for training.\n",
        "    validation_examples: A `DataFrame` containing one or more columns from\n",
        "      `california_housing_dataframe` to use as input features for validation.\n",
        "    validation_targets: A `DataFrame` containing exactly one column from\n",
        "      `california_housing_dataframe` to use as target for validation.\n",
        "      \n",
        "  Returns:\n",
        "    A `LinearClassifier` object trained on the training data.\n",
        "  \"\"\"\n",
        "\n",
        "  periods = 7\n",
        "  steps_per_period = steps / periods\n",
        "\n",
        "  # Create a linear classifier object.\n",
        "  my_optimizer = tf.train.FtrlOptimizer(learning_rate=learning_rate, l1_regularization_strength=regularization_strength)\n",
        "  my_optimizer = tf.contrib.estimator.clip_gradients_by_norm(my_optimizer, 5.0)\n",
        "  linear_classifier = tf.estimator.LinearClassifier(\n",
        "      feature_columns=feature_columns,\n",
        "      optimizer=my_optimizer\n",
        "  )\n",
        "  \n",
        "  # Create input functions.\n",
        "  training_input_fn = lambda: my_input_fn(training_examples, \n",
        "                                          training_targets[\"median_house_value_is_high\"], \n",
        "                                          batch_size=batch_size)\n",
        "  predict_training_input_fn = lambda: my_input_fn(training_examples, \n",
        "                                                  training_targets[\"median_house_value_is_high\"], \n",
        "                                                  num_epochs=1, \n",
        "                                                  shuffle=False)\n",
        "  predict_validation_input_fn = lambda: my_input_fn(validation_examples, \n",
        "                                                    validation_targets[\"median_house_value_is_high\"], \n",
        "                                                    num_epochs=1, \n",
        "                                                    shuffle=False)\n",
        "  \n",
        "  # Train the model, but do so inside a loop so that we can periodically assess\n",
        "  # loss metrics.\n",
        "  print(\"Training model...\")\n",
        "  print(\"LogLoss (on validation data):\")\n",
        "  training_log_losses = []\n",
        "  validation_log_losses = []\n",
        "  for period in range (0, periods):\n",
        "    # Train the model, starting from the prior state.\n",
        "    linear_classifier.train(\n",
        "        input_fn=training_input_fn,\n",
        "        steps=steps_per_period\n",
        "    )\n",
        "    # Take a break and compute predictions.\n",
        "    training_probabilities = linear_classifier.predict(input_fn=predict_training_input_fn)\n",
        "    training_probabilities = np.array([item['probabilities'] for item in training_probabilities])\n",
        "    \n",
        "    validation_probabilities = linear_classifier.predict(input_fn=predict_validation_input_fn)\n",
        "    validation_probabilities = np.array([item['probabilities'] for item in validation_probabilities])\n",
        "    \n",
        "    # Compute training and validation loss.\n",
        "    training_log_loss = metrics.log_loss(training_targets, training_probabilities)\n",
        "    validation_log_loss = metrics.log_loss(validation_targets, validation_probabilities)\n",
        "    # Occasionally print the current loss.\n",
        "    print(\"  period %02d : %0.2f\" % (period, validation_log_loss))\n",
        "    # Add the loss metrics from this period to our list.\n",
        "    training_log_losses.append(training_log_loss)\n",
        "    validation_log_losses.append(validation_log_loss)\n",
        "  print(\"Model training finished.\")\n",
        "\n",
        "  # Output a graph of loss metrics over periods.\n",
        "  plt.ylabel(\"LogLoss\")\n",
        "  plt.xlabel(\"Periods\")\n",
        "  plt.title(\"LogLoss vs. Periods\")\n",
        "  plt.tight_layout()\n",
        "  plt.plot(training_log_losses, label=\"training\")\n",
        "  plt.plot(validation_log_losses, label=\"validation\")\n",
        "  plt.legend()\n",
        "\n",
        "  return linear_classifier"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "9H1CKHSzIjno",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "linear_classifier = train_linear_classifier_model(\n",
        "    learning_rate=0.1,\n",
        "    # TWEAK THE REGULARIZATION VALUE BELOW\n",
        "    regularization_strength=0.0,\n",
        "    steps=300,\n",
        "    batch_size=100,\n",
        "    feature_columns=construct_feature_columns(),\n",
        "    training_examples=training_examples,\n",
        "    training_targets=training_targets,\n",
        "    validation_examples=validation_examples,\n",
        "    validation_targets=validation_targets)\n",
        "print(\"Model size:\", model_size(linear_classifier))"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    },
    {
      "metadata": {
        "id": "yjUCX5LAkxAX",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " ### Solution\n",
        "\n",
        "Cliquez ci-dessous pour afficher une solution."
      ]
    },
    {
      "metadata": {
        "id": "hgGhy-okmkWL",
        "colab_type": "text"
      },
      "cell_type": "markdown",
      "source": [
        " Une intensit\u00e9 de r\u00e9gularisation de 0,1 doit normalement suffire. Notez toutefois qu'il faut trouver un compromis\u00a0:\n",
        "une r\u00e9gularisation plus pouss\u00e9e g\u00e9n\u00e8re des mod\u00e8les plus petits, mais risque d'affecter le co\u00fbt de classification."
      ]
    },
    {
      "metadata": {
        "id": "_rV8YQWZIjns",
        "colab_type": "code",
        "colab": {
          "autoexec": {
            "startup": false,
            "wait_interval": 0
          }
        }
      },
      "source": [
        "linear_classifier = train_linear_classifier_model(\n",
        "    learning_rate=0.1,\n",
        "    regularization_strength=0.1,\n",
        "    steps=300,\n",
        "    batch_size=100,\n",
        "    feature_columns=construct_feature_columns(),\n",
        "    training_examples=training_examples,\n",
        "    training_targets=training_targets,\n",
        "    validation_examples=validation_examples,\n",
        "    validation_targets=validation_targets)\n",
        "print(\"Model size:\", model_size(linear_classifier))"
      ],
      "cell_type": "code",
      "execution_count": 0,
      "outputs": []
    }
  ]
}
